Novel Neural Network Prediction Systems for Human Promoters and Splice Sites

Author

  • Martin G. Reese
Abstract

We present a detailed theoretical study of the organization and structure of landmark sequences, such as promoters and splice junctions, in human DNA. Improved detection of these landmark sequences in genomic DNA is important for exon detection and gene assembly. The functions of eukaryotic promoters as initiators of transcription and of splice sites as signals for RNA assembly are among the most complex processes in molecular biology. Both consist of multiple functional sites in primary DNA that are involved in polymerase binding and in the splicing process, respectively. We analyzed the structure of the individual elements within promoters and splice sites using a novel technique that combines neural networks with weight pruning. For a complete promoter site prediction, we combine the single predictions for each element using time-delay neural networks (TDNNs). TDNNs are appropriate for recognizing promoter elements because they can combine multiple features, even those that appear at different relative positions in different sequences. Another advantage is the high selectivity of the TDNN, which is extremely important for promoter prediction systems, since these are known to produce a large number of false positive classifications. This TDNN predicts most of the annotated promoters in a set of human genes from GenBank (version 86.0). As an example, the TDNN finds the annotated promoter in a 13,865 base pair test gene, HUMTFPB, with a false positive rate of 0.05% (6 false positive predictions out of 13,865). On a test set containing 40 known human gene promoters and 1000 random DNA sequences, we were able to recognize 50% of the human gene promoters with a false positive rate of 0.8% (correlation coefficient of 0.61). Preliminary tests using these pruned neural networks for splice site prediction show very promising results.
In the future, we expect to reduce false positive predictions, and thereby improve performance, by combining gene-finding prediction methods with our local signal predictors, such as the promoter and splice site networks.
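
The abstract's point that a TDNN can recognize a feature at different relative positions can be illustrated with a minimal NumPy sketch. This is not the author's network: the function names `one_hot`, `tdnn_layer`, and `score_sequence`, the hand-set TATA-box detector, and its weights are illustrative assumptions, and nothing here is trained or pruned. The sketch only shows the core idea of a time-delay layer, namely one set of weights applied at every window position, followed by a maximum over positions.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as an (L, 4) array; unknown bases stay all-zero."""
    out = np.zeros((len(seq), 4))
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def tdnn_layer(x, weights, bias):
    """Apply the same feature detectors at every window position.

    x:       (L, 4) one-hot sequence
    weights: (K, W, 4), K detectors that each look at W consecutive bases
    bias:    (K,)
    returns: (L - W + 1, K) sigmoid activations, one row per position
    """
    width = weights.shape[1]
    n_pos = x.shape[0] - width + 1
    windows = np.stack([x[i:i + width] for i in range(n_pos)])
    act = np.einsum("pwc,kwc->pk", windows, weights) + bias
    return 1.0 / (1.0 + np.exp(-act))

def score_sequence(seq, weights, bias):
    """Strongest response of each detector anywhere in the sequence."""
    return tdnn_layer(one_hot(seq), weights, bias).max(axis=0)

# Hypothetical hand-set detector for a TATA-like motif (illustration only).
MOTIF = "TATAAA"
weights = 2.0 * one_hot(MOTIF)[None, :, :]   # shape (1, 6, 4)
bias = np.array([-11.0])                     # fires only on a full match

early = score_sequence("GGGG" + MOTIF + "CCCC", weights, bias)
late = score_sequence("CCCCCCCC" + MOTIF + "GG", weights, bias)
absent = score_sequence("GGGGCCCCGGGGCC", weights, bias)
print(early, late, absent)
```

Because the weights are shared across positions, `early` and `late` are identical even though the motif sits at different offsets, while `absent` stays close to zero. A trained system would learn such weights from data and, as the abstract describes, combine several element detectors.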

Similar resources

Integrated Model of DNA Sequence Numerical Representation and Artificial Neural Network for Human Donor and Acceptor Sites Prediction

The Human Genome Project has led to a huge inflow of genomic data. Since the completion of human genome sequencing, more and more effort is being put into identifying the splice sites of exons and introns (donor and acceptor sites). This calls on bioinformatics to analyze genome sequences and identify the locations of exon and intron boundaries, in other words, to predict splice sites....

Prediction of Driver’s Accelerating Behavior in the Stop and Go Maneuvers Using Genetic Algorithm-Artificial Neural Network Hybrid Intelligence

Research on vehicle longitudinal control with a stop and go system is presently one of the most important topics in the field of intelligent transportation systems. The purpose of stop and go systems is to assist drivers in repeatedly accelerating and stopping their vehicles in traffic jams. Such a system can improve driving comfort and safety and reduce the danger of collisions and fuel consumption....

AnG-HPR: Analysis of n-Gram based human Promoter Recognition

We describe a promoter recognition method named AnG-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). We computed n-grams (n=2, 3, 4, 5) as features and created frequency features to extract informative and discriminative features for effective classification. Neural network classifie...

The Use of Fundamental Color Stimulus to Improve the Performance of Artificial Neural Network Color Match Prediction Systems

In the present investigation attempts were made for the first time to use the fundamental color stimulus as the input for a fixed optimized neural network match prediction system. Four sets of data having different origins (i.e. different substrate, different colorant sets and different dyeing procedures) were used to train and test the performance of the network. The results showed that th...

Prediction of human mRNA donor and acceptor sites from the DNA sequence.

Artificial neural networks have been applied to the prediction of splice site location in human pre-mRNA. A joint prediction scheme where prediction of transition regions between introns and exons regulates a cutoff level for splice site assignment was able to predict splice site locations with confidence levels far better than previously reported in the literature. The problem of predicting do...

Publication date: 1995